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ABSTRACT:  We  quantify  uncertainty  in  complex  systems  by  a  flexible,  nonparametric  framework  for  esti¬ 
mating  probability  density  functions  of  output  quantities  of  interest.  The  framework  systematically  incorporates 
soft  information  about  the  system  from  engineering  judgement  and  experience  to  improve  the  estimates  and  en¬ 
sure  that  they  are  consistent  with  prior  knowledge.  The  framework  is  based  on  a  maximum  likelihood  criterion, 
with  epi-splines  facilitating  rapid  solution  of  the  resulting  optimization  problems.  In  four  numerical  examples 
with  few  realizations  of  the  system  output,  we  identify  the  main  features  of  output  densities  even  for  nonsmooth 
and  discontinuous  system  function  and  high-dimensional  inputs. 


1  INTRODUCTION 

We  address  uncertainty  quantification  (UQ)  in  com¬ 
plex  systems  by  a  flexible,  nonparametric  framework 
for  estimating  probability  density  functions  of  ran¬ 
dom  output  quantities  of  interests.  The  framework 
systematically  incorporates  hard  information  derived 
from  physics-based  sensors,  field  test  data,  and  com¬ 
puter  simulations  as  well  as  soft  information  from  en¬ 
gineering  knowledge  and  experience.  Our  ability  to 
account  for  soft  information  about  the  system  over¬ 
come,  in  part,  the  poor  accuracy  traditionally  expe¬ 
rienced  when  applying  classical  statistical  methods 
to  UQ.  The  framework  is  based  on  epi-splines  for 
consistent  approximation  of  infinite-dimensional  op¬ 
timization  problems  arising  in  the  estimation  process 
as  developed  by  Royset  and  Wets  (2013). 

UQ  aims  to  characterize  a  random  vector  Y  that  de¬ 
scribes  the  output  (response)  from  some  system,  mod¬ 
eled  by  a  function  g,  subject  to  input  given  by  a  ran¬ 
dom  vector  V.  We  use  uppercase  to  denote  random 
variables  and  lowercase  for  their  realizations.  Bold¬ 
face  type  indicates  a  vector.  Specifically, 

Y  =  fi'(V). 

While  the  distribution  of  V  may  be  known,  or  as¬ 


sumed  known,  the  distribution  of  Y  isn’t  easily  avail¬ 
able  due  to  the  complexity  of  g.  In  practice,  g  may 
be  given  in  terms  of  the  solution  of  differential  and 
algebraic  equations,  or  other  computationally  intense 
problems.  In  this  case,  the  time  required  to  evaluate 
g  at  a  single  point  v  limits  the  number  of  realiza¬ 
tions  of  Y  one  can  generate.  Moreover,  in  the  case 
of  physical  experiments  and  other  situations,  we  may 
only  have  available  a  sample  of  input-output  realiza¬ 
tions  (v^y1),  (v2,y2),  ...,  (vn,yn),  with  yJ  =  g(\J), 
of  finite,  and  typically  small,  size  that  cannot  be  aug¬ 
mented  without  exorbitant  costs.  In  the  absence  of  a 
large  set  of  realizations  of  Y,  empirical  estimates  of 
the  mean,  standard  deviation,  and  other  descriptions 
of  Y  tend  to  be  highly  inaccurate  and  may  lead  to  sig¬ 
nificant  underestimation  of  the  variability.  Since  the 
variability  of  the  random  vector  V  may  be  substan¬ 
tial,  sensitivity  and  perturbation  methods  (Ghanem 
and  Spanos  1991,  Kleiber  et  al.  1992)  applied  near 
the  mean  of  V  may  be  highly  inaccurate  as  well. 

Polynomial  chaos  expansion,  stochastic  Galerkin, 
and  stochastic  collocation  methods  are  frequently 
used  for  UQ  in  an  effort  to  overcome  these  diffi¬ 
culties  (Ghanem  and  Spanos  1991,  Xiu  and  Kami- 
adakis  2002,  Babuska  et  al.  2004,  Ganapathysubra- 
manian  and  Zabaras  2007,  Nobile  et  al.  2008,  Eldred 
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et  al.  2011).  These  methods  build  a  model  of  g  us¬ 
ing  polynomial  functions  and  then  estimate  moments 
of  Y  using  that  model  and  the  (assumed)  knowledge 
of  the  distribution  of  Y.  Under  certain  assumptions, 
the  model  may  approach  g  exponential  fast  as  the  fi¬ 
delity  of  the  model  tends  to  infinity,  for  example  in 
the  mean- square  sense.  A  main  difficulty  with  expan¬ 
sion  methods  is  that  the  accuracy  deteriorates  as  the 
dimensionality  of  V  grows,  g  becomes  less  ‘smooth,’ 
and  the  distribution  of  V  deviates  from  those  compat¬ 
ible  with  polynomials  in  the  Wiener-Askey  scheme 
(Xiu  and  Kamiadakis  2002).  Moreover,  there  is  no 
systematic  way  of  including  soft  information  about 
the  distribution  of  Y,  for  example  from  extensive  ex¬ 
perience  with  similar  systems  and  specific  properties 
of  g,  in  the  estimation  process.  These  difficulties  are 
present  even  if  Y  is  scalar  valued.  Smolyak  sparse- 
grid  approaches  (Xiu  and  Hesthaven  2005,  Gana- 
pathysubramanian  and  Zabaras  2007,  Nobile  et  al. 
2008),  separated  representations  based  on  alternating 
least-squares  approximation  techniques  (Doostan  and 
Iaccarino  2009),  and  related  approaches  are  steps  in 
the  direction  towards  handling  moderate  dimensions 
of  V,  but  still  significant  challenges  remain  when  han¬ 
dling  high-dimensional  cases.  We  refer  to  Helton  and 
Pilch  (201 1)  for  an  overview  of  recent  studies  on  UQ. 

In  this  paper,  we  propose  an  approach  to  UQ  that’s 
insensitive  to  the  dimension  of  V,  doesn’t  rely  on 
the  smoothness  of  g,  and  handles  an  arbitrary  density 
for  V.  We  estimate  the  probability  density  function 
of  Y  (assuming  it  exists)  using  the  available  sample 
{yJ}"=i,  which  may  be  small,  and  all  available  soft 
information  about  Y.  Such  information  may  come  in 
the  form  of  knowledge  about  the  support  of  the  prob¬ 
ability  density  function,  its  continuity,  smoothness, 
unimodality,  monotonicity,  moments,  and  other  char¬ 
acteristics.  Naturally,  if  soft  information  could  be  in¬ 
cluded  in  the  estimation  process,  it  might  greatly  re¬ 
duce  estimator  error  and  variance,  and  enhance  vi¬ 
sual  appeal  of  the  estimates.  This  effect  is  especially 
prominent  in  the  case  of  few  observations  of  Y.  While 
our  general  approach  applies  to  joint  probability  den¬ 
sity  functions,  we  here  focus  on  a  single  output  quan¬ 
tify  and  the  estimation  of  its  density.  Therefore  in  the 
remaining  of  the  paper,  we  let  Y  be  scalar  valued. 

The  consideration  of  soft  information  in  density  es¬ 
timation  is  hardly  new.  Classical  parametric  estima¬ 
tion  selects  a  parametric  family  of  densities,  presum¬ 
ably  based  on  soft  information,  and  determines  the 
‘best’  estimate  within  that  family.  Bayesian  estima¬ 
tion  makes  use  of  prior  soft  information  extensively, 
but  results  extend  much  beyond  that  premise  (Wahba 
1981,  Van  de  Geer  1987,  Thompson  and  Tapia  1990, 
Wets  1991,  Dupacova  1992,  Samaniego  and  Reneau 
1994,  Geyer  1994,  Groenenboom  et  al.  2001).  In 
nonparametric  density  estimation,  practitioners  may 
also  adjust  estimates  based  on  their  experience  in  an 
ad-hoc  manner.  Desirable  density  estimators  can  be 
achieved  by  means  of  penalties  (Good  and  Gaskin 


1971,  de  Montricher  et  al.  1975,  Leonard  1978,  Klo- 
nias  1982,  Silverman  1982).  While  in  principle  many 
types  of  constraints  in  the  estimation  problem  can  be 
represented  by  penalty  terms,  the  equivalence  of  such 
reformulations  depends  on  the  successful  selection  of 
multiplier  and  penalty  parameters  which  is  nontrivial 
in  practice.  We  view  density  estimation  problems  with 
soft  information  as  a  constrained  stochastic  optimiza¬ 
tion  problem,  where  the  goal  is  to  determine  a  den¬ 
sity  that  maximizes  the  likelihood  function  generated 
by  a  sample  subject  to  constraints  on  the  density  de¬ 
rived  from  the  soft  information.  While  the  connection 
between  density  estimation  and  optimization  has  long 
been  recognized  (see  Thompson  and  Tapia  (1990)  and 
Wets  (1991)  and  more  recently  Dong  and  Wets  (2007) 
and  Casey  and  Wets  (2013)),  our  framework  offers 
flexibility  in  the  inclusion  of  soft  information  and  a 
solid  theoretical  foundation  of  asymptotic  results;  see 
Royset  and  Wets  (2013)  for  details. 

The  remaining  of  the  paper  is  organized  as  follows. 
Section  2  defines  epi-splines  and  the  maximum  like¬ 
lihood  problem.  Section  3  discusses  soft  information 
and  its  implementation.  Section  4  illustrates  the  ap¬ 
proach  with  four  numerical  examples.  The  paper  ends 
with  conclusions  in  Section  5. 

2  EXPONENTIAL  EPI-SPLINE  ESTIMATE 

Given  a  sample  y1,y2:...:yn  from  Y,  we  consider  an 
estimate 

r(y)  =  exp(sn(y)),yeM, 

of  the  density  of  Y,  where  sn  is  an  appropriately  se¬ 
lected  epi-spline  in  view  of  the  sample  and  other  con¬ 
siderations.  Naturally,  we  refer  to  fn  as  an  exponen¬ 
tial  epi-spline  estimate.  As  we  see  below,  we  deter¬ 
mine  sn  by  maximizing  the  likelihood  function  asso¬ 
ciated  with  the  sample,  while  simultaneously  ensuring 
that  constraints  derived  form  soft  information  are  met. 
The  use  of  the  exponential  function  ensures  a  nonneg¬ 
ative  fn,  which,  of  course,  is  a  basic  requirement  for 
a  density.  While  finding  an  arbitrary  function  sn  in¬ 
volves  an  infinite-dimensional  optimization  problem, 
the  restriction  to  an  epi-spline  allows  the  use  of  a  finite 
number  of  parameters  and  still  maintains  flexibility  to 
approximate  essentially  any  density.  Once  fn  is  de¬ 
termined,  the  estimation  of  moments  of  Y  and  other 
quantifies  is  easily  carried  out  by  numerical  integra¬ 
tion. 

We  observe  that  the  estimation  of  fn  implicitly  ac¬ 
counts  for  the  system  function  g  in  two  ways.  The 
first  way  relates  to  the  generation  of  a  sample.  A 
sample  point  y]  is  typically  generated  by  sampling 
or  carefully  selecting  an  input  vJ ,  evaluating  g(vJ), 
and  then  setting  yJ  =  g(vJ  ).  Consequently,  the  sam¬ 
ple  y 1,y2,  ■■■,yn  includes  ‘hard’  information  about  g. 
We  don’t  examine  the  design  of  experiments,  i.e.,  the 
selection  of  v  at  which  to  evaluate  g.  However,  a  den¬ 
sity  established  through  the  present  framework  for  a 


Figure  1:  Illustration  of  epi-spline. 

given  sample  provides  an  informed  basis  for  select¬ 
ing  a  next  sample  point.  The  second  way  derives  from 
soft  information.  As  we  see  in  Section  3,  soft  informa¬ 
tion  about  g  may  translate  into  soft  information  on  Y, 
which  is  incorporated  as  constraints  in  the  likelihood 
maximization.  In  contrast  to  expansion  methods  for 
UQ,  we  don’t  attempt  to  build  a  model  of  g  directly. 

2.1  Epi-Splines 

We  follow  Royset  and  Wets  (2013)  and  define  an 
epi-spline  in  terms  of  its  order ,  given  by  a  non¬ 
negative  integer  p,  its  number  of  partitions  N,  and 
its  mesh  d  e  VN,  where  VN  consists  all  N  +  1- 
dimensional  vectors  d  =  (d0,di, ....  d^)  with  dk-i  < 
dk,  k  =  1,2 ,  Given  these  quantities,  an  epi- 

spline  is  a  real- valued  function  defined  on  the  closed 
interval  [d0,dN]  that  are  polynomial  of  degree  p  in 
each  segment  (dk_i,dk),  k  =  1,2, ...,  N,  and  that  are 
finite  valued  at  d0,  d\,  ...,  dN.  For  y  £  [d0,  dN],  we  as¬ 
sign  oo  to  the  epi-spline.  Figure  1  gives  an  example 
of  an  epi-spline. 

It’s  clear  from  the  definition  that  an  epi-spline  is 
uniquely  defined  by  a  finite  number  of  parameters. 
Each  segment  (4-i,  dk)  requires  p  +  1  parameters  to 
define  a  polynomial  of  degree  p.  Since  there  are  N 
segments  and  the  N  +  1  mesh  points  are  defined  sepa¬ 
rately,  the  total  number  of  parameters  is  (p  +  2)  N  +  1. 
We  organize  these  parameters  into  a  vector 

OL  (^0 ,  ^1 ,  •  •  • ,  ^7V ;  ^1 ,  ^2 ;  •  •  •  >  j 

where  sk,  k  =  0, 1, ...,  N,  are  scalars  giving  the  values 
of  an  epi-spline  at  the  mesh  points  and 

Uk,l  j  •  ••  j  (^k,p)  j 

is  a  (p  +  1) -dimensional  vector  representing  the  co¬ 
efficients  in  the  polynomial  of  the  kth  segment,  k  = 
1, 2, ...,  N.  We  refer  to  a  as  the  epi-spline  parameter. 
We  use  the  notation  yk  =  y  —  dk-u  k  =  1,2 , 
and  denote  by  0*,  the  /.  -dimensional  zero  vector,  with 


Oo  being  a  term  that’s  omitted.  Then,  every  epi-spline 
s  is  of  the  form 

s(y)  =  (c{y),a),  ye[d0,dN], 

where  (-,  •)  denotes  the  dot  product  and  c  is  a  vector 
of  basis  functions.  Specifically, 

(0jv+i+(p+i)(fc— 1),  1  ,Vk,  0(p+i)W-fc)) 

,  v  _  <  if  y  e  (dk~i,dk),k  =  1,2 

{0k,  1,  Oa^— fc+(p+l)Af ) 

ify  =  dk,k  =  0,l,...,N. 

In  essence,  if  y  coincides  with  a  mesh  point,  c  simply 
assigns  s(y)  to  be  equal  to  the  component  of  a  corre¬ 
sponding  to  that  mesh  point.  If  y  e  (dk-i,  dk),  then  c 
selects  the  components  of  a  that  describe  the  polyno¬ 
mial  in  that  segment,  i.e., 

p 

s(y )  = 

i= 0 

It’s  clear  that  epi-splines  relate  to  classical  splines, 
but  we  allow  for  jumps  and  our  construction  is  dif¬ 
ferent.  An  epi-spline  of  any  order  p  can  approxi¬ 
mate  to  an  arbitrary  accuracy  any  lower-  and  upper- 
semicontinuous  function  by  increasing  the  number  of 
partitions  N  (Royset  and  Wets  2013).  Consequently, 
we  have  the  ability  to  approximate  essentially  any 
density  that  one  may  encounter  in  practice.  In  fact, 
many  common  density  functions  are  exactly  repre¬ 
sented  on  [d0,4v]-  For  example,  a  normal  density  is 
of  the  form  exp(— s (y)),  with  s  being  an  epi-spline 
of  order  2  for  any  number  of  partitions  N  and  mesh 
d.  An  exponential  density  is  of  the  form  exp(—s  (?/)), 
with  s  being  an  epi-spline  of  order  1,  for  any  num¬ 
ber  of  partitions  N  and  mesh  d,  with  d0  =  0.  Even 
more  densities,  such  as  the  lognormal  and  the  Pareto, 
are  exactly  represented  on  a  bounded  interval  after  a 
logarithmic  transformation. 

2.2  Maximum  Likelihood 

We  proceed  by  adopting  a  maximum  likelihood  cri¬ 
terion  to  determine  the  best  exponential  epi-spline 
that,  of  course,  integrates  to  unity.  For  realizations 
y1,  ....,2/n  drawn  from  a  density  f  (y)  =  exp (—s(y)), 
the  maximum  likelihood  function  takes  the  form 

n  n  n 

Y[f(yJ)  =  exp(~ 

j= 1  i=1  i=1 

where  the  last  equality  follows  from  the  assumption 
that  s  is  an  epi-spline  with  epi-spline  parameter  a. 
Since  optimal  solutions  are  invariant  to  a  logarithmic 
transformation,  we  prefer  instead  of  maximizing  this 
expression  to  maximize 

n 

3  =  1 


which  is  linear  in  a.  Moreover,  the  maximization  is 
subject  to  the  constraint 


rdN 


exp(-(c(y),a))dy  =  1, 


'd0 


(1) 


which  ensures  that  the  resulting  solution  actually  is 
a  density,  as  well  as  any  other  constraints  that  may 
be  generated  by  soft  information  as  discussed  further 
in  Section  3.  As  we  see  below,  the  latter  constraints 
are  typically  linear  and  consequently  easy  to  handle. 
Under  rather  general  assumptions,  (1)  can  be  relaxed 
to  a  <  constraint,  or  even  being  moved  to  the  objec¬ 
tive  function  as  a  ‘penalty,’  resulting  in  a  convex  opti¬ 
mization  problem  for  which  there  are  well-developed 
algorithms.  Even  if  the  equality  constraint  must  re¬ 
main,  the  maximum  likelihood  problem  can  usually 
be  solved  by  nonlinear  programming  algorithms  with 
little  difficulties;  see  Royset  and  Wets  (2013)  for  de¬ 
tails. 

An  optimal  solution  an  obtained  by  solving  the 
above  maximum  likelihood  problem  gives  an  estimate 
fn(y )  =  exp(-(c(r/),an)),  y  G  [d0,dN],  of  the  ‘true’ 
density.  Under  general  assumptions,  the  estimate  con¬ 
verges  to  the  true  one  as  the  sample  size  n  tends  to 
infinity.  Moreover,  moment  estimates  obtained  by  in¬ 
tegrating  fn  converge  also  to  the  correct  values.  We 
refer  to  Royset  and  Wets  (2013)  for  a  thorough  analy¬ 
sis. 


3  SOFT  INFORMATION 

We  next  discuss  how  to  specify  and  implement  soft 
information  about  Y.  We  consider  specific  examples, 
but  omit  a  discussion  of  tail-related  soft  information. 
The  latter  information  becomes  important  if  the  re¬ 
sulting  densities  are  to  be  used  in  reliability  analysis. 
However,  since  estimating  tail  behavior  requires 
special  care  in  the  presence  of  little  data,  we  here 
avoid  this  situation  and  instead  focus  on  the  ‘central’ 
characteristic  of  a  density. 

Support  bounds  and  mesh.  The  choice  of  mesh 
d  =  (d0,di,  ...,dN)  accounts  for  support  bounds  and 
do  and  djv  should,  ideally,  correspond  to  the  lower 
and  upper  bound  of  Y,  respectively.  If  V  is  bounded 
and  g  continuous,  then  d0  and  dN  can  be  determined, 
at  least  in  principle,  by  minimizing  and  maximizing 
g  over  the  possible  input  values,  respectively.  Even 
if  this  isn’t  possible,  insight  about  the  problem  may 
provide  estimates  of  lower  and  upper  bounds  on 
Y  that  can  serve  as  d0  and  d^-  The  mesh  is  often 
selected  to  be  uniform,  but  if  discontinuities  and 
intervals  with  steep  slopes  can  be  anticipated  other 
choices  may  be  preferred. 

Continuity.  We  ensure  that  an  exponential  epi-spline 


estimate  is  continuous  by  imposing  the  constraint 

p 

$k— 1  ®fc,Cb  $k  ^  ^  {dk  dk— i)  ',  k  1,2,...,  N. 

i= 0 

Of  course,  by  omitting  some  of  the  above  constraints, 
one  has  the  ability  to  ensure  continuity  at  some  but 
not  all  mesh  points.  All  of  these  constraints  are  linear. 

Smoothness.  We  restrict  the  search  to  r-times  contin¬ 
uously  differentiable  densities,  with  r  <  p,  by  impos¬ 
ing  the  conditions  for  continuity  and  the  linear  con¬ 
straints  for  k  —  1,2, ...,  N  —  1  ,j  =  1,2,  ...,r, 

p  j- 1 

^  ^  |  |  (i  dk—  i)  ®fc+r j. 

i=j  1=0 

Higher  order  smoothness  is  automatically  achieved 
if  these  constraints  are  imposed  with  r  =  p.  Again, 
selective  implementation  of  these  constraints  could 
be  a  useful  tool  in  practice. 

Pointwise  Fisher  information.  We  define  the  point- 
wise  Fisher  information  of  an  exponential  epi-spline 
density  /  at  y  to  be 


f(y)/f(y)  =  ~(c'(y),a)  =  -^iaKi(y  -  4-i)*  1 

i— 1 

for  y  G  (dk-i,dk).  Upper  and  lower  bounds  on  this 
quantity  result  in  linear  constraints.  The  constraints 
could  be  imposed  at  any  number  of  values  of  y,  but 
we  note  that  if  p  —  2  and  the  density  is  strongly 
unimodal,  as  describe  below,  and  continuously 
differentiable,  then  lower  bounds  on  f'(y)/ f(y)  at 
d\,  d2,  ...,  dN  suffices  to  ensure  that  the  constraints 
are  satisfied  for  all  y  G  [cZ0,  djv]-  Similarly,  an  upper 
bound  on  f'(y)/ f(y)  need  only  be  imposed  at  d0,  d\, 
...,  d jv— i •  In  applications,  one  can  obtain  guidance 
for  appropriate  values  of  upper  and  lower  bounds  on 
these  quantities  by  considering  standard  densities. 
For  example,  a  normal  density  with  mean  //  and 
standard  deviation  a  has  f(y)/  f(y)  =  —  (y  —  p)/cr2. 
For  the  exponential  density  with  parameter  A, 
f'(y)/f(y)  =  -A  for  all  y  >  0. 

Monotonicity.  We  achieve  a  nondecreasing  (nonin¬ 
creasing)  density  by  imposing  nonnegativity  (non¬ 
positivity)  on  f(y)/f(y)  for  all  y  G  (dk-i,dk),k  = 

1.2.. .., TV  as  described  above  as  well  as  for  k  = 

1.2. . ...1V, 


p 

$k—  1  ^  0,  $k  —  (_ )  dk— i)  . 

i= 0 

Again,  simplifications  arise,  for  example,  if  p  =  2  and 
the  density  is  strongly  unimodal.  Then,  it  suffices  to 


impose  that  ak>  1  +  2 afc)2(4  -  dk- 1)  <  0  (ofc, i  >  0), 

k  =  1,2, 

Strongly  unimodal.  We  recall  that  a  density  is 
strongly  unimodal  if  it’s  continuous  and  log-concave. 
Consequently,  if  f(y )  =  exp(-(c(y),  a)),  y  G 
[4,4v],  strong  unimodality  requires  that  (c(y),a)  is 
a  convex  function  in  y.  This  condition  is  ensured  if 
(i)  continuity  is  imposed  (see  above),  (ii)  for  k  = 
1,2 ,  ...,7V  —  1,  the  epi-spline’s  left  derivatives  at  dk 
is  no  larger  than  its  right  derivative,  i.e.,  for  k  = 
1)  2,  ...,N  —  1, 

p 

5  j ^Qfc,t(4  4— i) 

i= 1 

and  (iii)  on  each  (4_1?4),  fc  =  1,2,  ...,7V,  {c(y),a) 
is  a  convex  function  in  y,  i.e.,  for  k  —  1, 2, ...,  N,y  e 

(4—i  ?  4-)  j 

p 

-  !)<**: Ay  -  dk-iT~2  >  o. 

i=2 

Here,  the  obvious  interpretations  are  required  when 
p  =  0, 1.  The  latter  condition  simplifies  to  afci2  >  0, 
k  =  1,2,. ..,1V,  whenp  =  2. 

Bounds  on  density  values.  It’s  also  straightforward 
to  impose  pointwise  upper  and  lower  bounds  u(y )  and 

l(y)  on  the  value  of  f(y)  =  exp (—(c(y),a),  with  0  < 
l(y)  —  u(v )  <  It  suffices  to  set  for  y  G  (4-i;4)> 

p 

-log  1(0  >  4-0*  >  -logu(f) 

i=0 

and  for  y  =  4, 

-logZ(f)  >  Sfc  >  -logu(f),A;  =  0,1,. ..,1V. 

These  constraints  are  also  linear. 

Kullback-Leibler  divergence.  In  some  applications, 
it’s  useful  to  only  seek  densities  that  are  close  to  some 
known  density  in  some  sense.  ‘Distances’  between 
densities  are  conveniently  measured  by  the  Kullback- 
Leibler  divergence  from  a  density  f\  to  a  density  /2 
as  given  by 

4vl(/i||/2)  =  J  fi(y)  log  j^dy. 

Here  we  make  the  standard  interpretation  that 
4  log  4/4  —  0  when  4  =  0  regardless  of  the  value 
of  4  and  4  log  4/4  —  00  when  4  >  0  and  4  =  0. 
If  /2  is  an  exponential  epi-spline,  with  epi-spline  pa¬ 
rameter  a  and  /2(y)  >  0  whenever  f\  (y)  >  0,  then 

d'KL^fl  1 14) 


c(y)fi(y)dy,a )  + 


0-ogfily))fi(y)dy, 


where  y  is  the  part  of  the  real  fine  with  /1  (y)  >0.  (We 
observe  that  if  /2  (y)  =  0  for  more  than  a  countable 
number  of  y  with  j\  (y)  >  0,  then  dKL(fi\\h)  —  00 
and  the  comparison  is  less  interesting.)  Consequently, 
a  linear  constraint 

f  c{y)f1{y)dy,a^  +  J  (log  fi(y))fi(y)dy  <  k 

in  the  likelihood  maximization  ensures  that  the  result¬ 
ing  estimate  fn(y )  =  exp(— (c(y),  an))  is  no  further 
away  from  the  reference  density  f\  than  n,  as  mea¬ 
sured  by  the  Kullback-Leibler  divergence.  The  choice 
of  k  in  applications  can  be  informed  by  the  fact  that 
the  Kullback-Leibler  divergence  between  two  normal 
densities,  <4  and  4,  with  mean  and  standard  devia¬ 
tion  pi,ai  and  /j2,  cr2,  respectively,  takes  the  form 

1 

dKL(<f>l\\<h)  =  log(<72/<Ti)  -  - 

+  4/2  +  (z^1  —  ^2 )2  +  2 (of  +  —  yi))- 

For  two  exponential  densities,  f\  and  /2,  with  param¬ 
eters  Ai  and  A2,  respectively, 

4\l(/i||/2)  =  log(Ai/A2)  —  1  +  A2/Ai. 


Bounds  on  moments.  Soft  information  may  result  in 
a  constraint  on  the  j-th  moment  of  the  form 

fdN 

l<  /  yje~My)’a)dy<u, 

J  do 

where  l,u  G  JR,  l  <  u  are  given  constants.  The 
right-most  inequality  results  in  a  convex  constraint 
on  a,  while  the  left-most  in  a  nonconvex  constraint. 

Bounds  on  cumulative  distribution  functions.  If  the 

cumulative  distribution  function  at  7  G  [d0,dN]  of 
the  estimated  density  fn  must  fie  between  the  lower 
bound  l  and  the  upper  bound  u,  then  we  obtain  the  fol¬ 
lowing  constraints  for  the  maximum  likelihood  prob¬ 
lem: 

ri  rdN 

/  e-{c^’a)dy  <  u  and  /  e~{c^a)dy  <  1  _  l, 

J  do  J  7 

which  are  both  convex. 


Gradient  information.  If  the  input  v  is  a  scalar  and 
the  derivative  of  g  is  known  at  a  point  v,  then  the  den¬ 
sity  /  of  Y  must  satisfy 


f(d(v))  > 


04) 

W(v)\’ 


(2) 


where  0  is  the  density  of  V,  with  equality  holding 
if  g  is  strictly  increasing  or  strictly  decreasing. 
Consequently,  the  right-hand  side  provides  a  lower 
bound  on  the  density  estimate  fn  in  the  likelihood 
maximization  problem. 


4  NUMERICAL  EXAMPLES 


3 


We  illustrate  the  UQ  approach  with  four  examples, 
where  likelihood  maximization  problems  are  solved 
by  ‘fmincon’  in  Matlab  versions  7.10.0  and  con¬ 
tinuously  differentiable  epi-splines  of  order  2  with 
equally  spaced  mesh  are  used.  If  not  otherwise  stated, 
we  use  N  =  50  partitions.  If  there  is  no  soft  in¬ 
formation  about  support  bounds,  we  set  do  (djy)  to 
two  standard  errors  below  (above)  the  average  sam¬ 
ple  point.  The  Gauss-Legendre  quadrature  rule  with 
20  points  evaluates  the  integrals  over  each  segment 
(dk-i,dk)  when  needed.  (Only  a  highly  varying  den¬ 
sity  demands  such  accuracy,  but  we  default  to  this  as 
it  comes  at  little  computational  cost.)  For  comparison, 
we  also  compute  kernel  estimates  of  densities  using 
‘ksdensity’  in  Matlab,  with  the  default  normal  kernel. 


4.1  Example  1:  Bar  with  random  Young’s  modulus 
We  consider  the  differential  equation 

{E{x, y)u'{x))'  —  0,  re  (0,1)  (3) 

with  boundary  conditions  u(  0)  =  u0  and 
77(1,  v)-u'(l)  =  1  describing  the  displacement  of 
a  bar  with  unit  length,  where  E(x,  v)  is  Young’s  mod¬ 
ulus  at  x  under  realization  v  =  (vi ,  v2,  ■  ■■ ,  Vioo)  •  We  let 
E(x,\)  =  Vi  on  ((i  —  1) /100,  i/100),  i  —  1,2, ...,  100, 
i.e.,  piecewise  constant.  We  assume  that  v  is  unknown 
and  we  model  it  by  a  100-dimensional  random  vector 
V,  with  each  component  being  uniformly  distributed 
on  [0.1  0.5]  and  independent. 

Figure  2  illustrates  the  estimated  density  fn  of  the 
endpoint  displacement  u(  1),  which  is  random  due 
to  the  randomness  in  E,  based  on  a  sample  of  size 
n  =  10  of  that  displacement.  The  sample  is  obtained 
by  drawing  independently  10  realizations  of  V  and 
then  solving  (3),  and  is  illustrated  by  circles  in  Fig¬ 
ure  2.  The  solid  curve  gives  fn  under  the  additional 
soft  information  requiring  that  the  density  is  unimodal 
and  within  0.01  in  Kullback-Leibler  divergence  from 
a  normal  density  with  mean  4  and  standard  deviation 
0.2.  Moreover,  in  this  case  it’s  easy  to  determine  up¬ 
per  and  lower  bounds  of  the  endpoint  displacement, 
which  are  10  and  2,  respectively.  The  kernel  estimate, 
using  the  same  sample,  is  given  by  a  dashed  curve.  We 
note  that  the  kernel  estimate  is  unable  to  take  advan¬ 
tage  of  the  unimodal  and  Kullback-Leibler  soft  infor¬ 
mation.  (In  this  case,  the  kernel  estimate  is  unimodal 
by  coincidence.)  The  dotted  curve  in  Figure  2  gives 
the  ‘true’  density  as  estimated  by  a  sample  of  size  105. 
We  find  that  the  exponential  epi-spline  estimate  fn  is 
quite  close  to  the  true  density  even  with  only  10  sam¬ 
ple  points. 


. True  Density 

- Epi-Spline  Estimate 

- Kernel  Estimate 

O  Sample 


Figure  2:  Density  of  endpoint  displacement  in  Example  1. 

4.2  Example  2:  Linear  system  under  harmonic 
excitation 

The  next  example  is  taken  from  Chopra  (1995,  Sec¬ 
tion  12.1)  and  deals  with  a  two  degree -of-freedom  lin¬ 
ear  system  governed  by 

miUi{t)  +  (fci  +  k2)ui(t)  -  k2u2(t )  =  posinvt 

m2u2{t )  -  k2ui(t)  +  k2u2(t)  =  0 

where  mi  and  m2  are  the  masses  for  nodes  one  and 
two,  respectively,  ki  and  k2  are  the  corresponding 
stiffnesses,  pQ  is  the  excitation  magnitude,  v  the  exci¬ 
tation  frequency.  We  letpG  =  1,  mi  =  1/2,  m2  =  1/4, 
ki  =  1,  and  k2  =  1/2.  Then,  the  natural  frequencies  of 
the  first  and  second  nodes  are  u>i  =  1  and  u>2  =  2.  We 
concentrate  on  the  displacement  of  the  second  node, 
which  in  steady-state  is  u2(t)  =  u2osinvt,  with 

U2°  =  (1  —  n2)(l  —  n2/4)  ’  (4) 

Suppose  that  v  is  unknown  and  modeled  by  a  ran¬ 
dom  variable  V  distributed  by  a  mixture  of  beta  den¬ 
sities.  Specifically,  with  probability  1/3,  V  follows  a 
beta(2,2)  distribution  on  [0,1],  with  probability  1/3, 
V  follows  a  beta(2,2)  distribution  on  [1,2],  and  with 
probability  1/3,  V  follows  a  beta(2,2)  distribution 
on  [2,3].  We  seek  to  estimate  the  density  of  u2o.  Of 
course,  the  distribution  of  u2o  is  therefore  also  a  mix¬ 
ture  and  a  prior  insight  about  this  may  help  the  con¬ 
struction  of  exponential  epi-spline  estimates.  How¬ 
ever,  we  don’t  pursue  that  here.  While  one  could  at¬ 
tempt  UQ  by  means  of  polynomial  expansion,  the 
choice  of  polynomial  is  nontrivial  as  the  mixture  of 
beta  distributions  isn’t  a  standard  Askey  distribution. 
More  importantly,  further  difficulties  arise  since  (4)  is 
discontinuous  at  the  natural  frequencies  1  and  2. 

Using  a  sample  of  size  100,  we  obtain  the  expo¬ 
nential  epi-spline  estimate  in  Figure  3;  see  the  solid 
curve.  Here  we  include  the  soft  information  that  the 
density  is  unimodal  on  the  upper  30%  of  the  support 
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Figure  3:  Density  of  second-node  displacement  magnitude  in 
Example  2. 


Figure  4:  Density  of  second-node  displacement  magnitude  in 
Example  2  under  gradient  information. 

and  likewise  for  the  lower  30%.  For  comparison,  we 
use  a  sample  of  size  10'  to  obtain  a  highly  accurate 
estimate  of  the  true  density,  which  is  illustrated  by  the 
dotted  curve.  As  can  be  expected  from  the  form  of  g, 
the  true  density  is  highly  varying  with  a  discontinuity 
at  approximately  —1.78.  Even  in  the  presence  of  the 
incorrect  soft  information  about  continuous  differen¬ 
tiability,  we  find  that  the  exponential  epi-spline  esti¬ 
mate  captures  the  essence  of  the  variability  in  u2o-  In 
comparison,  the  kernel  estimate  (see  the  dashed  curve 
in  Figure  3)  poorly  represents  the  true  density. 

Figure  4  presents  similar  results  but  for  a  smaller 
sample  of  only  20  draws,  but  now  with  also  gradient 
information  at  those  20  points  utilized  in  (2).  We  also 
extend  the  unimodality  constraint  to  40%  to  achieve 
less  volatile  tails.  The  gradient  information  requires 
additional  flexibility  in  the  epi-spline  and  we  increase 
the  number  of  partitions  N  to  100.  We  again  see  that 
the  exponential  epi-spline  captures  the  essence  of  the 
true  density  even  with  the  low  sample  size. 


Figure  5:  Density  of  Sobol  function  output  in  Example  3. 


4.3  Example  3:  Sobol  function 

In  this  example,  we  use  the  Sobol  function 


s(v) = n 

3= 1 


|4  Vj  —  2|  +  cij 

1  <ij 


(5) 


with  Vj  being  independent  and  uniformly  distributed 
on  [0, 1]  and  aj  >  0.  As  in  Eldred  et  al.  (201 1),  we  use 
rn  —  5  and  a  =  (0, 1,2,4, 8).  Using  10  sample  points, 
we  obtain  the  exponential  epi-spline  estimate  in  Fig¬ 
ure  5  (see  the  solid  curve).  We  assume  that  the  support 
is  nonnegative,  the  density  is  unimodal,  and  the  point- 
wise  Fisher  information  is  in  the  interval  [—1, 0].  We 
use  N  =  20.  We  see  that  our  estimate  follows  the  true 
density  (estimated  using  a  sample  of  size  107;  see  the 
dotted  curve)  quite  well,  especially  in  the  upper  tail. 
In  contrast,  the  kernel  estimate  oscillates  near  zero. 
This  is  also  a  case  where  there  are  difficulties  in  ob¬ 
taining  accurate  polynomial  expansions  as  g  is  nons¬ 
mooth. 


4.4  Example  4:  Aeroshell  Dynamics 

This  example  is  taken  from  Swiler  et  al.  (2009)  and 
deals  with  a  bonding  material  in  an  aeroshell.  A  3- 
D  finite-element  model  gives  the  frequency  in  shear 
mode,  which  is  our  main  concern.  The  frequency  is 
uncertain  due  to  unknown  Young’s  modulus  E  and 
Poissons  ratio  v  in  a  material  component.  Each  evalu¬ 
ation  of  the  finite-element  model  takes  about  2  hours 
and  few  values  of  E  and  v  can  be  examined.  We  con¬ 
sider  a  data  set  of  10  frequencies  corresponding  to 
various  values  of  E  and  v  (see  Swiler  et  al.  2009)  and 
aim  to  estimate  the  density  of  the  frequency. 

Figure  6  shows  the  estimated  density  (solid  curve) 
given  unimodality  and  support  bounds  [813, 2884]  de¬ 
duced  from  experience  with  the  system.  We  use  N  = 
20.  The  sample  is  illustrated  by  circles.  We  find  that 
the  exponential  epi-spline  estimate  gives  a  reasonable 
representation  of  the  uncertainty,  while,  in  contrast, 
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Figure  6:  Density  of  shear  mode  frequency  in  Example  4. 

the  kernel  estimate  (dashed  curve)  exhibits  oscilla¬ 
tions  that  can’t  be  expected  for  this  system. 

5  CONCLUSIONS 

We  present  a  novel  approach  to  UQ  based  on  prob¬ 
ability  density  estimation  for  an  output  quantity 
of  interest.  We  are  able  to  achieve  good-quality 
estimates  of  such  densities  even  for  small  data  sets 
by  including  soft  information  that  may  be  derived 
from  engineering  insight  and  judgement.  With  as 
little  as  10-100  realizations  of  the  output,  we  are  able 
to  identify  the  main  features  of  the  densities  even  for 
nonsmooth  and  discontinuous  system  function  and 
high-dimensional  inputs.  The  densities  can  lead  to 
new  understanding  of  the  system  uncertainty  and  pro¬ 
vide  means  for  estimating  moments  and  resampling 
as  may  be  needed  in  further  analysis  and  optimization. 
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